Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+
To monitor critical infrastructure, high quality sensors sampled at a high
frequency are increasingly used. However, as they produce huge amounts of data,
only simple aggregates are stored. This removes outliers and fluctuations that
could indicate problems. As a remedy, we present a model-based approach for
managing time series with dimensions that exploits correlation in and among
time series. Specifically, we propose compressing groups of correlated time
series using an extensible set of model types within a user-defined error bound
(possibly zero). We name this new category of model-based compression methods
for time series Multi-Model Group Compression (MMGC). We present the first MMGC
method GOLEMM and extend model types to compress time series groups. We propose
primitives for users to effectively define groups for differently sized data
sets, and based on these, an automated grouping method using only the time
series dimensions. We propose algorithms for executing simple and
multi-dimensional aggregate queries on models. Last, we implement our methods
in the Time Series Management System (TSMS) ModelarDB (ModelarDB+). Our
evaluation shows that compared to widely used formats, ModelarDB+ provides up
to 13.7 times faster ingestion due to high compression, 113 times better
compression due to the adaptivity of GOLEMM, 630 times faster aggregates by
using models, and close to linear scalability. It is also extensible and
supports online query processing.
Comment: 12 Pages, 28 Figures, and 1 Table
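The idea of representing data points by models within a user-defined error bound, and then answering aggregates from the models, can be illustrated with a minimal sketch. This is not GOLEMM and not ModelarDB+'s API: it assumes a single time series, one constant-value model type, and an absolute error bound, and the names Segment, compress, and sum_from_models are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One constant-value model covering `length` consecutive data points."""
    start: int    # index of the first data point the segment covers
    length: int   # number of data points the segment represents
    value: float  # constant that approximates every covered data point


def compress(values: List[float], error_bound: float) -> List[Segment]:
    """Greedily fit constant models (illustrative assumption, not GOLEMM).

    A segment grows while the spread of its values is at most twice the
    error bound, so the midpoint is within the bound of every data point.
    An error bound of zero only merges runs of identical values.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    start = 0
    lo = hi = values[0]
    for i in range(1, len(values)):
        v = values[i]
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo > 2 * error_bound:
            # The current model can no longer represent v: emit it, start anew.
            segments.append(Segment(start, i - start, (lo + hi) / 2))
            start, lo, hi = i, v, v
        else:
            lo, hi = new_lo, new_hi
    segments.append(Segment(start, len(values) - start, (lo + hi) / 2))
    return segments


def sum_from_models(segments: List[Segment]) -> float:
    """Compute SUM directly from the models without reconstructing points."""
    return sum(s.length * s.value for s in segments)


readings = [20.0, 20.1, 19.9, 25.0, 25.2, 25.1]
models = compress(readings, error_bound=0.2)
approx_sum = sum_from_models(models)
```

In this sketch the six readings collapse into two segments, and the aggregate costs one multiplication per segment rather than one addition per data point, which hints at why aggregates on models can be faster than aggregates on raw data.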
Time Series Management Systems: A Survey
The collection of time series data increases as more monitoring and
automation are being deployed. These deployments range in scale from an
Internet of Things (IoT) device located in a household to enormous distributed
Cyber-Physical Systems (CPSs) producing large volumes of data at high velocity.
To store and analyze these vast amounts of data, specialized Time Series
Management Systems (TSMSs) have been developed to overcome the limitations of
general-purpose Database Management Systems (DBMSs) for time series
management. In this paper, we present a thorough analysis and classification of
TSMSs developed through academic or industrial research and documented through
publications. Our classification is organized into categories based on the
architectures observed during our analysis. In addition, we provide an overview
of each system with a focus on the motivational use case that drove the
development of the system, the functionality for storage and querying of time
series a system implements, the components the system is composed of, and the
capabilities of each system with regard to Stream Processing and Approximate
Query Processing (AQP). Last, we provide a summary of research directions
proposed by other researchers in the field and present our vision for a next
generation TSMS.
Comment: 20 Pages, 15 Figures, 2 Tables, Accepted for publication in IEEE TKDE